The goals for this assignment are to:
By the end of the assignment you should have several static maps displaying the datasets we’ve used in the last few weeks.
Find 2 examples of maps that you think are ‘bad’.
Question 1 Why are they bad? What might improve them? On the most basic level, these maps are bad because they don’t really mean anything. I struggle to think of any scenario, ever, in which you would need to know how many counties away from the coast a particular county is, when “county” is a totally meaningless and wide-ranging unit of distance. The second map is slightly more interesting, but it’s equally unclear exactly why or when it would matter. I suppose it’s one way of showing that, for many states, less populated areas might be more red? It’s also pretty useless in that the lack of state labels make it hard to even know what you’re looking at. I generally know where all the states are, but with this organization I’d be hard-pressed to differentiate between, say, Indiana and Iowa in any functional or accessible way.
Question 2 Rely on the Healy and Wilke texts to provide some structure to your answers. To expand on the above, it would appear that both of these maps got the science/data right (albeit for fairly uninteresting questions, in my opinion), but got the art really wrong. The color choice for the first map (counties) also doesn’t have a color ramp that goes in one, consistent direction (goes from dark to light and back to dark again via different colors), meaning it would look bad as a black-and-white image and likely would not work well for a color-blind person.
You can choose whichever datasets you’d like from the past several months as the subject for your mapping. You’ll need to use at least one tabular join, one spatial join, and one extraction to create the dataframe. Load the packages, the data, and make sure everything is projected here. Give me a sense for what you are hoping to map.
library(ggplot2)
library(tidyverse)
## Warning in system("timedatectl", intern = TRUE): running command 'timedatectl'
## had status 1
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble 3.1.4 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## v purrr 0.3.4
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(pander)
library(dplyr)
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(units)
## udunits database from /usr/share/xml/udunits/udunits2.xml
library(terra)
## terra version 1.3.22
##
## Attaching package: 'terra'
## The following object is masked from 'package:pander':
##
## wrap
## The following object is masked from 'package:dplyr':
##
## src
# install.packages("cartogram")
library(cartogram)
##
## Attaching package: 'cartogram'
## The following object is masked from 'package:terra':
##
## cartogram
# install.packages("ggmap")
library(ggmap)
## Google's Terms of Service: https://cloud.google.com/maps-platform/terms/.
## Please cite ggmap if you use it! See citation("ggmap") for details.
##
## Attaching package: 'ggmap'
## The following object is masked from 'package:terra':
##
## inset
library(tmap)
# install.packages("patchwork")
library(patchwork)
##
## Attaching package: 'patchwork'
## The following object is masked from 'package:terra':
##
## area
library(viridis)
## Loading required package: viridisLite
# regional land values
landval <- terra::rast('/opt/data/session16/Regval.tif')
# mammal richness
mammal.rich <- rast('/opt/data/session16/Mammals_total_richness.tif')
mammal.rich <- catalyze(mammal.rich) # get value layer
mammal.rich <- mammal.rich[[2]]
# PADUS designation lands
pas.desig <- st_read("/opt/data/session04/regionalPAs1.shp")
## Reading layer `regionalPAs1' from data source
## `/opt/data/session04/regionalPAs1.shp' using driver `ESRI Shapefile'
## Simple feature collection with 224 features and 7 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -1714157 ymin: 1029431 xmax: -621234.8 ymax: 3043412
## Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic_USGS_version
# PADUS proclamation lands
pas.proc <- st_read("/opt/data/session16/reg_pas.shp")
## Reading layer `reg_pas' from data source `/opt/data/session16/reg_pas.shp' using driver `ESRI Shapefile'
## Simple feature collection with 544 features and 32 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -2034299 ymin: 1004785 xmax: -530474 ymax: 3059862
## Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic_USGS_version
# get the column names sorted out
colnames(pas.proc)[c(1, 6, 8, 10, 12, 22, 25)] <- colnames(pas.desig)
## Warning in colnames(pas.proc)[c(1, 6, 8, 10, 12, 22, 25)] <-
## colnames(pas.desig): number of items to replace is not a multiple of replacement
## length
gap.sts <- c("1", "2", "3") # good with Gap statuses 1-3
pas <- pas.proc %>%
select(., colnames(pas.desig)) %>%
bind_rows(pas.desig, pas.proc) %>% #select the columns that match and then combine
filter(., State_Nm =="UT" & GAP_Sts %in% gap.sts ) %>% # filter for Utah and gap status
st_make_valid() %>%
st_buffer(., 10000) # add a 10km buffer
# double check that it's all valid
all(st_is_valid(pas))
## [1] TRUE
Build out the dataframe
# get regional boundary area
utah <- tigris::states(cb=TRUE) %>%
filter(STUSPS == "UT")
##
|
| | 0%
|
| | 1%
|
|= | 1%
|
|= | 2%
|
|== | 2%
|
|== | 3%
|
|=== | 4%
|
|=== | 5%
|
|==== | 5%
|
|==== | 6%
|
|===== | 7%
|
|===== | 8%
|
|====== | 8%
|
|====== | 9%
|
|======= | 10%
|
|======== | 11%
|
|======== | 12%
|
|========= | 12%
|
|========= | 13%
|
|========== | 14%
|
|========== | 15%
|
|=========== | 16%
|
|============ | 17%
|
|============= | 19%
|
|============== | 20%
|
|=============== | 22%
|
|================ | 22%
|
|================= | 24%
|
|================== | 26%
|
|=================== | 27%
|
|=================== | 28%
|
|============================= | 41%
|
|============================== | 43%
|
|========================================== | 61%
|
|====================================================== | 77%
|
|======================================================= | 78%
|
|============================================================ | 86%
|
|================================================================= | 92%
|
|================================================================= | 93%
|
|======================================================================| 100%
# make vector data into SpatVectors
pa.vect <- as(pas, "SpatVector")
ut.vect <- as(utah, "SpatVector")
# get everything into the same projection as mammal richness data
pa.vect <- project(pa.vect, mammal.rich) # buffered protected areas
ut.vect <- project(ut.vect, mammal.rich)
land.val.proj <- project(landval, mammal.rich)
# crop raster to the size of the utah vector
mam.rich.crop <- crop(mammal.rich, ut.vect)
ut.val.crop <- crop(land.val.proj, ut.vect)
crs(mam.rich.crop) == crs(ut.val.crop) # true
## [1] TRUE
# check it out
plot(mam.rich.crop)
plot(ut.vect, add=TRUE)
plot(pa.vect, add=TRUE)
# get census data involved
ut.census <- tidycensus:: get_acs(geography = "county",
variables = c(medianincome = "B19013_001",
pop = "B01003_001"),
state = c("UT"),
year = 2019,
key = "19c3ab9bcdc3ec1c81d11d51d790be308678a8d2",
geometry = TRUE) %>%
st_transform(., crs(pa.vect)) %>% # get it in the right crs
select(-moe) %>% # get rid of moe variable
spread(variable, estimate)
## Getting data from the 2015-2019 5-year ACS
## Downloading feature geometry from the Census website. To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
##
|
| | 0%
|
|= | 2%
|
|======= | 10%
|
|======== | 11%
|
|================ | 23%
|
|========================== | 37%
|
|==================================== | 52%
|
|============================================== | 66%
|
|======================================================== | 80%
|
|========================================================== | 82%
|
|=================================================================== | 96%
|
|======================================================================| 100%
# extract and join everything
pa.summary <- st_join(st_as_sf(pa.vect), ut.census, join = st_overlaps) # find overlapping areas
pa.summary <- pa.summary %>%
group_by(Unit_Nm) %>%
summarize(., meaninc = mean(medianincome, na.rm=TRUE),
meanpop = mean(pop, na.rm=TRUE))
# check row numbers
nrow(pa.summary) ==length(unique(pas$Unit_Nm))
## [1] TRUE
# run zonal statistics
pa.zones <- terra::rasterize(pa.vect, mam.rich.crop, field = "Unit_Nm")
mammal.zones <- terra::zonal(mam.rich.crop, pa.zones, fun = "mean", na.rm=TRUE)
landval.zones <- terra::zonal(ut.val.crop, pa.zones, fun = "mean", na.rm=TRUE)
#Note that there is one few zone than we have in our PA dataset. This is because we have an overlapping jurisdiction; we'll ignore that now but it's a common problem with using the PADUS
summary.df <- pa.summary %>%
left_join(., mammal.zones) %>%
left_join(., landval.zones)
## Joining, by = "Unit_Nm"
## Joining, by = "Unit_Nm"
head(summary.df)
## Simple feature collection with 6 features and 5 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -1573126 ymin: 43005.16 xmax: -1158358 ymax: 171195.5
## Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic
## # A tibble: 6 x 6
## Unit_Nm meaninc meanpop geometry Value Regval
## <chr> <dbl> <dbl> <POLYGON [m]> <dbl> <dbl>
## 1 "Ashdown Gorge~ 49426. 29348. ((-1481296 146366.4, -1481291 14~ 79.5 9.78
## 2 "Bears Ears Na~ NaN NaN ((-1220258 103285.2, -1220222 10~ 82.5 8.32
## 3 "Bears Ears Na~ NaN NaN ((-1203663 134683.7, -1203663 13~ 82 8.21
## 4 "Bears Ears Na~ NaN NaN ((-1223019 99560.4, -1223066 999~ 81.2 8.45
## 5 "Beartrap Cany~ 55823 108512 ((-1487848 135977.4, -1487338 13~ NA NA
## 6 "Beaver Dam (N~ 59839 165811 ((-1566805 100404.4, -1566655 10~ NA NA
Practice making a quick map with tmap.
# kind of dumpy version
tm_shape(summary.df) + tm_polygons(col = "meanpop", border.col = "white") +
tm_legend(outside = TRUE)
# more detailed map
tm_shape(mam.rich.crop) +
tm_raster("Value", palette = viridis(n=50), n=50, legend.show=FALSE, legend.hist = TRUE, legend.hist.title = "Species Richness") +
tm_shape(utah) +
tm_borders("white", lwd = .75) +
tm_shape(summary.df) +
tm_polygons(col = "meanpop", border.col = "white", title="Mean Population") +
tm_legend(outside = TRUE)
Your map should have a basemap, should rely on more than one aesthetic (color, transparency, etc), and combine multiple layers.
# simple ggplot mapping polygons by mean income
ggplot(summary.df) +
geom_sf(mapping = aes(fill = meaninc))
# more complex ggplot with ggmap basemap
bg <- ggmap::get_map(as.vector(st_bbox(utah))) # get basemap for correct bounding box/location
## Source : http://tile.stamen.com/terrain/8/46/95.png
## Source : http://tile.stamen.com/terrain/8/47/95.png
## Source : http://tile.stamen.com/terrain/8/48/95.png
## Source : http://tile.stamen.com/terrain/8/49/95.png
## Source : http://tile.stamen.com/terrain/8/50/95.png
## Source : http://tile.stamen.com/terrain/8/46/96.png
## Source : http://tile.stamen.com/terrain/8/47/96.png
## Source : http://tile.stamen.com/terrain/8/48/96.png
## Source : http://tile.stamen.com/terrain/8/49/96.png
## Source : http://tile.stamen.com/terrain/8/50/96.png
## Source : http://tile.stamen.com/terrain/8/46/97.png
## Source : http://tile.stamen.com/terrain/8/47/97.png
## Source : http://tile.stamen.com/terrain/8/48/97.png
## Source : http://tile.stamen.com/terrain/8/49/97.png
## Source : http://tile.stamen.com/terrain/8/50/97.png
## Source : http://tile.stamen.com/terrain/8/46/98.png
## Source : http://tile.stamen.com/terrain/8/47/98.png
## Source : http://tile.stamen.com/terrain/8/48/98.png
## Source : http://tile.stamen.com/terrain/8/49/98.png
## Source : http://tile.stamen.com/terrain/8/50/98.png
## Source : http://tile.stamen.com/terrain/8/46/99.png
## Source : http://tile.stamen.com/terrain/8/47/99.png
## Source : http://tile.stamen.com/terrain/8/48/99.png
## Source : http://tile.stamen.com/terrain/8/49/99.png
## Source : http://tile.stamen.com/terrain/8/50/99.png
ggmap(bg) +
geom_sf(data = summary.df, mapping = aes(fill = meaninc), inherit.aes = FALSE) +
geom_sf(data=utah, fill=NA,color="black", inherit.aes = FALSE) +
coord_sf(crs = st_crs(4326))
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.
# one more time with a scale
ggmap(bg) +
geom_sf(data = summary.df, mapping = aes(fill = Value,
alpha = (Regval - max(Regval, na.rm=TRUE))/(max(Regval, na.rm=TRUE)-min(Regval, na.rm = TRUE))), inherit.aes = FALSE) +
geom_sf(data=utah, fill=NA,color="black", inherit.aes = FALSE) +
scale_fill_viridis(option="magma")+
labs(title="Species Richness and Land Value in Utah Protected Areas", x="Longitude", y="Latitude", fill="Species Richness", alpha="Land Cost Factor") +
coord_sf(crs = st_crs(4326))
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Follow the examples to build cartograms that display your region of interest based on variables other than area.
ut_pop <- cartogram_cont(ut.census, "pop", itermax = 5)
## Mean size error for iteration 1: 9.60643980267167
## Mean size error for iteration 2: 8.32188317662873
## Mean size error for iteration 3: 7.39036231258215
## Mean size error for iteration 4: 6.58869987673001
## Mean size error for iteration 5: 5.84424708779424
tm_shape(ut_pop) + tm_polygons("pop", style = "jenks", title="County Population (2019)") +
tm_layout(frame = FALSE, legend.position = c("left", "bottom"))
# get counties
ut.counties <- tigris::counties(state="UT",cb=TRUE)
# try a different population visualization with non-contiguous area
ut_pop_ncont <- cartogram_ncont(ut.census, "pop")
tm_shape(ut.counties) + tm_borders() +
tm_shape(ut_pop_ncont) +
tm_polygons("pop", style = "jenks", title="County Population (2019)") +
tm_layout(frame = FALSE, legend.outside=TRUE)
# cartogram based on county income
ut_inc <- cartogram_cont(ut.census, "medianincome", itermax = 5)
## Mean size error for iteration 1: 2.38752229037587
## Mean size error for iteration 2: 1.72970306617314
## Mean size error for iteration 3: 1.40461081872995
## Mean size error for iteration 4: 1.23123889199607
## Mean size error for iteration 5: 1.13876858930548
tm_shape(ut_inc) + tm_polygons("medianincome", style = "jenks", title="Median Household Income (2019)") +
tm_layout(frame = FALSE, legend.position = c("left", "bottom"))
# different style of cartogram
ut_inc_ncont <- cartogram_ncont(ut.census, "medianincome")
tm_shape(ut.counties) + tm_borders() +
tm_shape(ut_inc_ncont) +
tm_polygons("medianincome", style = "jenks", title="Median Household Income (2019)") +
tm_layout(frame = FALSE, legend.outside=TRUE)
# looks like true trash
Question 3: Reflect on the different maps you’ve made, what do the different visualizations tell you about the data you plotted? Generally speaking, the maps tell me that species richness is higher in southern and western Utah and less-populated places, which are also areas with lower mean median income and lower land values. Species richness is lower around major urban areas with higher populations (namely, the SLC area), which also have higher mean median household income and land values. The maps that combine species richness and demographic data are more useful than the separately-visualized population and income data. That said, the cartograms make income and population more straightforward.
Question 4: How might you improve the maps you’ve made? For the map with the species richness raster under the population by buffered protected areas polygons, I might visualize the data in a way that doesn’t obscure the species richness under the opacity of the population polygons. In that map, the species richness histogram also isn’t completely intuitive, so I would want to adjust that for clarity.
The chloropleth with median income is fine, but is somewhat disrupted by 1) the amount of missing data and 2) the direction of the color gradient. It’s not intuitive (at least to me) for lower income to be a darker color, so I’d probably reverse the color ramp.
The map that shows land values is interesting but difficult to interpret. The different transparency levels are hard to distinguish both on the map and in the legend, and the legend itself does not indicate the point of the transparency levels. Part of the issue with this is the overlap in the buffered protected areas, so maybe there are ways to work around that to improve the visualization.
The cartograms that use a different size/color of county polygon to represent larger or smaller populations/incomes are also aesthetically horrible. The ones that warp the shape of Utah work better, so I would just use those instead of the non-continuous version.
Question 5: Is a map the best way to evaluate the data you worked with? Why or Why not? I think a map is a great way to explore this data and start seeing patterns that might be interesting to explore further. However, the patterns created by overlaying data don’t really say anything about trends, they don’t point toward decision-making options, and they don’t necessarily imply anything about what the patterns mean. What does it mean that protected areas near SLC have higher land values and higher incomes associated? Does that mean anything for conservation values on the ground in those protected areas? What are the species captured by species richness, and would these maps change substantially if we included abundance?
Although the cartograms are more straightforward and might be one of the better ways to visualize and/or evaluate the population and income data, they do not have the level of granularity that might lead to any kind of policy or management decision. For example, the income data do not say anything about income inequality or more granular patterns within counties.